Handling of Missing Data Induced by Time-Varying Covariates in Comparative Effectiveness Research HIV Patients [Methods Study], 2013-2018 (ICPSR 39528)

Version Date: Oct 9, 2025

Principal Investigator(s):
Manisha Desai, Stanford University

https://doi.org/10.3886/ICPSR39528.v1

Version V1


Researchers can use data from health registries or electronic health records to compare two or more treatments. Registries store data about patients with a specific health problem. These data include how well those patients respond to treatments and information about patient traits, such as age, weight, or blood pressure. But sometimes data about patient traits are missing.

Missing data about patient traits can lead to incorrect study results, especially when traits change over time. For example, weight can change over time, and the patient may not report their weight at some points along the way. Researchers use statistical methods to fill in these missing data.

In this study, the research team compared a new statistical method for filling in missing data against traditional methods. Traditional methods either remove patients with missing data or fill in each missing number with a single estimate. The new method creates multiple possible estimates to fill in each missing number.
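The three strategies described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the study's implementation: the function names and the `weights` values are hypothetical, and the multiple-imputation step draws each missing value from a normal distribution fitted to the observed values, whereas a full procedure would also draw the model parameters and condition on other covariates. The pooling step uses Rubin's rules, the standard way to combine estimates across imputed data sets.

```python
import random
import statistics


def complete_case(values):
    """Traditional method 1: drop missing entries (None) and analyze the rest."""
    return [v for v in values if v is not None]


def single_mean_imputation(values):
    """Traditional method 2: fill every missing entry with one estimate (the observed mean)."""
    fill = statistics.mean(complete_case(values))
    return [fill if v is None else v for v in values]


def multiple_imputation_mean(values, m=20, seed=0):
    """Estimate the mean from m imputed copies and pool the results with Rubin's rules.

    Simplifying assumption: missing entries are drawn from a normal distribution
    fitted to the observed entries.
    """
    rng = random.Random(seed)
    observed = complete_case(values)
    mu, sd = statistics.mean(observed), statistics.stdev(observed)
    n = len(values)
    estimates, variances = [], []
    for _ in range(m):
        completed = [rng.gauss(mu, sd) if v is None else v for v in values]
        estimates.append(statistics.mean(completed))
        variances.append(statistics.variance(completed) / n)  # squared SE of the mean
    pooled = statistics.mean(estimates)        # pooled point estimate
    within = statistics.mean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance ** 0.5


# Hypothetical weights (kg) with two unreported measurements.
weights = [70.0, None, 82.5, 64.0, None, 91.0, 77.5, 68.0]

cc = complete_case(weights)
si = single_mean_imputation(weights)
si_se = (statistics.variance(si) / len(si)) ** 0.5
mi_mean, mi_se = multiple_imputation_mean(weights)
```

In this sketch the single-imputation standard error `si_se` is smaller than the pooled `mi_se`, because filling every gap with the same number treats the filled values as if they had been observed, while the between-imputation term in Rubin's rules carries the extra uncertainty.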

To access the methods, software, and R package, please visit the SimulateCER GitHub repository and the SimTimeVar CRAN page.

Desai, Manisha. Handling of Missing Data Induced by Time-Varying Covariates in Comparative Effectiveness Research HIV Patients [Methods Study], 2013-2018. Inter-university Consortium for Political and Social Research [distributor], 2025-10-09. https://doi.org/10.3886/ICPSR39528.v1

Funding: Patient-Centered Outcomes Research Institute (PCORI) (ME-1303-5989)
Distributor: Inter-university Consortium for Political and Social Research

2013 -- 2018

To evaluate statistical approaches for handling missing data in longitudinal comparative effectiveness research (CER) studies through the following goals: (1) create a tool to simulate studies observed in CER for method evaluation; (2) determine how to perform multiple imputation (MI) of derived predictors such as interactions; (3) determine how to perform MI of outcomes derived from repeated measures; and (4) assess the performance of MI and commonly applied approaches when describing relationships between time-varying covariates, with and without missing values, and a right-censored outcome.
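A simulation tool of the kind named in goal (1) is useful because the true value is known, so each method can be scored against it over many replicates. The sketch below shows one way such scoring could look. Everything in it is an assumption for illustration: the function names, the normal data model, the missing-completely-at-random mechanism, and the complete-case mean used as the method under test. None of it is taken from the study's R package.

```python
import random
import statistics


def performance(estimates, truth):
    """Summarize replicate estimates against the known true value."""
    bias = statistics.mean(estimates) - truth
    empirical_se = statistics.stdev(estimates)  # spread of estimates across replicates
    mse = statistics.mean([(e - truth) ** 2 for e in estimates])
    return {"bias": bias, "empirical_se": empirical_se, "mse": mse}


def one_replicate(rng, truth, n=50, missing_rate=0.3):
    """Simulate one study, delete values completely at random, return the complete-case mean."""
    data = [rng.gauss(truth, 1.0) for _ in range(n)]
    kept = [x for x in data if rng.random() > missing_rate]
    return statistics.mean(kept)


rng = random.Random(2013)
truth = 5.0
estimates = [one_replicate(rng, truth) for _ in range(500)]
result = performance(estimates, truth)
```

The three numbers are linked: mean squared error equals squared bias plus the (population) variance of the estimates, so a method can have low bias and still score poorly on MSE if its estimates vary widely.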

The research team conducted 3 simulation studies to address key questions, assessing performance with metrics including mean squared error, bias, and standard errors. To investigate imputation of interactions, the team evaluated active and passive MI strategies: the active strategy imputes the interaction term as if it were any other variable, whereas the passive strategy derives the term (rather than imputing it) only after imputing the main effects, i.e., by simply taking the product of the main effects. They evaluated these strategies under the joint modeling (JM) approach, in which a joint parametric distribution is assumed for the imputation model, and the fully conditional specification (FCS) approach, in which specification of a joint model is bypassed and conditional models for each variable are assumed instead. The team assessed similar approaches for imputing an outcome, the rate of change, when a 2-stage linear model was employed. Finally, the team investigated commonly applied methods, including complete case (CC) analysis and single imputation (SI), for handling missing time-varying covariates, as well as MI that ignores the clustered data structure (MI Naïve). For the latter, they established a comprehensive R package to simulate data including time-varying covariates with a complex correlation structure to represent realistic CER studies.
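The structural difference between the active and passive strategies can be shown with a toy example. In this sketch the imputation model is deliberately replaced by a simple observed-mean fill so that the arithmetic is easy to follow; the study itself used JM and FCS imputation models, and the function names and data here are hypothetical.

```python
import statistics


def mean_fill(column):
    """Stand-in imputation model (an assumption): replace None with the observed mean."""
    observed = [v for v in column if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in column]


def product_or_missing(a, b):
    """The interaction is missing whenever either main effect is missing."""
    return None if a is None or b is None else a * b


def passive_interaction(x1, x2):
    """Passive: impute the main effects, then derive the interaction as their product."""
    x1_filled, x2_filled = mean_fill(x1), mean_fill(x2)
    return x1_filled, x2_filled, [a * b for a, b in zip(x1_filled, x2_filled)]


def active_interaction(x1, x2):
    """Active: treat the interaction as just another incomplete variable and impute it."""
    x12 = [product_or_missing(a, b) for a, b in zip(x1, x2)]
    return mean_fill(x1), mean_fill(x2), mean_fill(x12)


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [10.0, None, 30.0, 50.0]  # second record is missing x2

p1, p2, p12 = passive_interaction(x1, x2)
a1, a2, a12 = active_interaction(x1, x2)
```

For the second record, the passive interaction is the product of the filled main effects (2.0 × 30.0 = 60.0), while the active interaction is imputed on its own (100.0, the mean of the observed products) and so no longer equals that product. The passive strategy keeps the derived term internally consistent; the active strategy lets the imputation model treat the term like any other variable.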

Simulated data that resemble the complexity of an empirical study to identify antiretroviral therapies associated with increased risk of cardiovascular disease among patients with HIV


2025-10-09


Notes

  • The public-use data files in this collection are available for access by the general public. Access does not require affiliation with an ICPSR member institution.


This study is maintained and distributed by the Patient-Centered Outcomes Data Repository (PCODR). PCODR is the official data repository of the Patient-Centered Outcomes Research Institute (PCORI).